<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
	<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=windows-1251">
	<TITLE>local_docs_create_index</TITLE>
	<META NAME="GENERATOR" CONTENT="OpenOffice.org 2.0  (Linux)">
	<META NAME="AUTHOR" CONTENT="Andrew">
	<META NAME="CREATED" CONTENT="20070320;18390000">
	<META NAME="CHANGEDBY" CONTENT="Michael">
	<META NAME="CHANGED" CONTENT="20070524;19130000">
	<STYLE TYPE="text/css">
	<!--
		@page { size: 8.27in 11.69in; margin-right: 0.59in; margin-top: 0.79in; margin-bottom: 0.79in }
		P { margin-bottom: 0.08in; direction: ltr; color: #000000; text-align: left; widows: 2; orphans: 2 }
		P.western { font-family: "Times New Roman", serif; font-size: 12pt; so-language: ru-RU }
		P.cjk { font-family: "Times New Roman", serif; font-size: 12pt; so-language:  }
		P.ctl { font-family: "Times New Roman", serif; font-size: 12pt; so-language: ar-SA }
		H3 { margin-bottom: 0.04in; direction: ltr; color: #000000; text-align: left; widows: 2; orphans: 2 }
		H3.western { font-family: "Arial", sans-serif; font-size: 13pt; so-language: ru-RU }
		H3.cjk { font-family: "Times New Roman", serif; font-size: 13pt; so-language:  }
		H3.ctl { font-family: "Arial", sans-serif; font-size: 13pt; so-language: ar-SA }
		A:link { color: #0000ff }
	-->
	</STYLE>
</HEAD>
<BODY LANG="ru-RU" TEXT="#000000" LINK="#0000ff" DIR="LTR">
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US">Install
Wikipedia; see the file &quot;C:\Program
Files\edoc\docs\install-wikipedia.doc&quot; (available after installation).</SPAN></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US">Install
the Java SE Development Kit
(</SPAN><FONT COLOR="#0000ff"><U><A HREF="http://java.sun.com/javase/downloads/index.jsp"><SPAN LANG="en-US">http://java.sun.com/javase/downloads/index.jsp</SPAN></A></U></FONT><SPAN LANG="en-US">)
to &quot;C:\Program Files\Java\jdk1.6.0&quot;.</SPAN></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US">Install
GATE (</SPAN><FONT COLOR="#0000ff"><U><A HREF="http://gate.ac.uk/"><SPAN LANG="en-US">http://gate.ac.uk</SPAN></A></U></FONT><SPAN LANG="en-US">)
to &quot;C:\Program Files\GATE 3.1\&quot;.</SPAN></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US">Install
RuPOSTagger (</SPAN><FONT COLOR="#0000ff"><U><A HREF="http://rupostagger.sourceforge.net/"><SPAN LANG="en-US">http://rupostagger.sourceforge.net</SPAN></A></U></FONT><SPAN LANG="en-US">)
on Linux or under Cygwin.</SPAN></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>	Copy
&quot;C:\Program Files\edoc\gate_plugins\RussianPOSTagger&quot; to
&quot;C:\Program Files\GATE 3.1\Plugins\RussianPOSTagger&quot;.</FONT></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>Now
install edoc to &quot;C:\Program Files\edoc&quot;.</FONT></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>This
text can be found at &quot;C:\Program
Files\edoc\docs\install-readme.doc&quot; after installation.</FONT></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>Thank
you.</FONT></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3><I>Post-installation
instructions.</I></FONT></P>
<H3 LANG="en-US" CLASS="western"><FONT SIZE=3 STYLE="font-size: 13pt"><B>Index
creation</B></FONT></H3>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>Open
the following files in a text editor to see the list of
parameters:</FONT></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=2><B>Table
1. English index</B></FONT></P>
<TABLE WIDTH=640 BORDER=1 BORDERCOLOR="#000000" CELLPADDING=7 CELLSPACING=0>
	<COL WIDTH=196>
	<COL WIDTH=201>
	<COL WIDTH=200>
	<TR VALIGN=TOP>
		<TD WIDTH=196>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Index type \ Operating
			system</FONT></P>
		</TD>
		<TD WIDTH=201>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Windows</FONT></P>
		</TD>
		<TD WIDTH=200>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Unix</FONT></P>
		</TD>
	</TR>
	<TR VALIGN=TOP>
		<TD WIDTH=196>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Topic-based index</FONT></P>
		</TD>
		<TD WIDTH=201>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>create_index_by_wikipedia.bat</FONT></P>
		</TD>
		<TD WIDTH=200>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>create_index_by_wikipedia.sh</FONT></P>
		</TD>
	</TR>
	<TR VALIGN=TOP>
		<TD WIDTH=196>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Direct index (full
			text search)</FONT></P>
		</TD>
		<TD WIDTH=201>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>local_docs_create_index.bat</FONT></P>
		</TD>
		<TD WIDTH=200>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>local_docs_create_index.sh</FONT></P>
		</TD>
	</TR>
</TABLE>
<P CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=2><B>Table
2. Russian index</B></FONT></P>
<TABLE WIDTH=640 BORDER=1 BORDERCOLOR="#000000" CELLPADDING=7 CELLSPACING=0>
	<COL WIDTH=113>
	<COL WIDTH=265>
	<COL WIDTH=218>
	<TR VALIGN=TOP>
		<TD WIDTH=113>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Index type \ Operating
			system</FONT></P>
		</TD>
		<TD WIDTH=265>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Windows</FONT></P>
		</TD>
		<TD WIDTH=218>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Unix</FONT></P>
		</TD>
	</TR>
	<TR VALIGN=TOP>
		<TD WIDTH=113>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Topic-based index</FONT></P>
		</TD>
		<TD WIDTH=265>
			<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>create_index_by_wikipedia_ru.bat
			<BR>(on host Student, 192.168.0.29) or </FONT>
			</P>
			<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>create_index_by_wikipedia_ru_ghost.bat</FONT></P>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>(on host Ghost,
			192.168.0.177)</FONT></P>
		</TD>
		<TD WIDTH=218>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>create_index_by_wikipedia_ru.sh</FONT></P>
		</TD>
	</TR>
	<TR VALIGN=TOP>
		<TD WIDTH=113>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>Direct index (full
			text search)</FONT></P>
		</TD>
		<TD WIDTH=265>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>local_docs_create_index_ru.bat</FONT></P>
		</TD>
		<TD WIDTH=218>
			<P LANG="en-US" CLASS="western"><FONT SIZE=3>local_docs_create_index_ru.sh</FONT></P>
		</TD>
	</TR>
</TABLE>
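<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>The
two tables above can be read as a simple lookup. The following is only a
minimal sketch in Python, not part of edoc: the function name
index_script, the keys &quot;topic&quot;, &quot;direct&quot;,
&quot;en&quot; and &quot;ru&quot;, and the dictionary layout are
assumptions made for illustration. The host-specific variant
create_index_by_wikipedia_ru_ghost.bat is not covered.</FONT></P>
<PRE>
```python
import platform

# Script base names copied from Tables 1 and 2. The keys and the dictionary
# layout are assumptions of this sketch, not something edoc defines.
SCRIPTS = {
    ("topic", "en"): "create_index_by_wikipedia",
    ("topic", "ru"): "create_index_by_wikipedia_ru",
    ("direct", "en"): "local_docs_create_index",
    ("direct", "ru"): "local_docs_create_index_ru",
}


def index_script(index_type, language, system=None):
    """Return the script file name for an index type, language and OS."""
    if system is None:
        system = platform.system()  # e.g. "Windows" or "Linux"
    base = SCRIPTS[(index_type, language)]
    # Windows uses the .bat files, every other system the .sh files.
    extension = ".bat" if system == "Windows" else ".sh"
    return base + extension


if __name__ == "__main__":
    print(index_script("topic", "ru", "Windows"))
    print(index_script("direct", "en", "Linux"))
```
</PRE>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>Calling
index_script without the third argument picks the extension for the
system the sketch is running on.</FONT></P>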
<P CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<H3 LANG="en-US" CLASS="western"><FONT SIZE=3 STYLE="font-size: 13pt"><B>Graph-based
search of relevant documents</B></FONT></H3>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>Graph-based
relevance calculation can be performed via a web service. </FONT>
</P>
<P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US"><SPAN STYLE="background: #ffff00">Set
up a Tomcat server</SPAN> (???).</SPAN></P>
<P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US">Copy
the folder &quot;C:\Program Files\edoc\inetpub&quot; to the web server (e.g.
to the IIS folder &quot;C:\inetpub&quot;), then open in Internet Explorer:
</SPAN><FONT COLOR="#0000ff"><U><A HREF="http://cais/ksnet2/scripts/doc/show_documents.php"><SPAN LANG="en-US">http://cais/ksnet2/scripts/doc/show_documents.php</SPAN></A></U></FONT><SPAN LANG="en-US">.</SPAN></P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>The
topic-based index creation program (&quot;C:\Program
Files\edoc\create_index_by_wikipedia.bat&quot;) takes the following
parameters:</FONT></P>
<OL>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>onto_id
	&ndash; ontology identifier, e.g. 139</FONT></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>iiLangID
	&ndash; language identifier (0 &ndash; English, 1 &ndash; Russian)</FONT></P>
	<LI><P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US"><FONT SIZE=3><FONT COLOR="#000000">n_neighbour_classes
	&ndash; maximum number of similar classes and attributes (for each
	wiki category or article) to extract and store. <B>Default</B> 5.</FONT></FONT></SPAN></P>
	<LI><P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US"><FONT SIZE=3><FONT COLOR="#000000">k
	&ndash; weight coefficient in [0,1] (distance between (1) Wikipedia
	articles and categories and (2) names of ontology classes and
	attributes). <B>Default</B> 0.5.</FONT></FONT></SPAN></P>
	<LI><P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US"><FONT SIZE=3><FONT COLOR="#000000">root_set_size
	&ndash; number of articles in the root set (used to search for authority
	pages with the HITS algorithm); a negative value means no limit. <B>Default</B>
	10.</FONT></FONT></SPAN></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>ksnetcontext_db
	&ndash; KSNetcontext database address (IP and port) and name, e.g.
	&quot;192.168.0.101:3306/ksnetcontext&quot; or
	&quot;192.168.0.101:3306/ksnetcontext_ru&quot;</FONT></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>ksnetcontext_user
	&ndash; KSNetcontext user name, e.g. &ldquo;michael&rdquo;</FONT></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>ksnetcontext_pass
	&ndash; KSNetcontext user password, e.g. &ldquo;12345&rdquo;</FONT></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>wiki_host
	&ndash; Wikipedia database host IP, e.g. &ldquo;192.168.0.29&rdquo;</FONT></P>
	<LI><P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US"><FONT SIZE=3><FONT COLOR="#000000">wiki_db
	&ndash; Wikipedia database name, optionally followed by an initialization
	string. The initialization string works around a bug in the creation of
	the database (wrong sequence, codepage, etc.); normally the name alone
	is enough.<BR>e.g.
	&ldquo;<SPAN STYLE="background: #ffff00">en</SPAN>wiki?useUnicode=false&amp;characterEncoding=ISO8859_1&amp;autoReconnect=true&amp;useUnbufferedInput=false&rdquo;
	for
	English<BR>&ldquo;<SPAN STYLE="background: #ffff00">ru</SPAN>wiki?useUnicode=false&amp;characterEncoding=ISO8859_1&amp;autoReconnect=true&amp;useUnbufferedInput=false&rdquo;
	for Russian</FONT></FONT></SPAN></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>wiki_user
	&ndash; Wikipedia user name, e.g. &ldquo;javawiki&rdquo;. The name may
	differ; it is chosen by the database administrator</FONT></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>wiki_pass
	&ndash; Wikipedia user password, e.g. &lsquo;&rsquo; (empty)</FONT></P>
	<LI><P CLASS="western" STYLE="margin-bottom: 0in"><SPAN LANG="en-US"><FONT SIZE=3><FONT COLOR="#000000">wiki_host_prefix
	&ndash; URL prefix of the wiki host, e.g. </FONT></FONT></SPAN><FONT COLOR="#0000ff"><U><A HREF="http://en.wikipedia.org/wiki/"><SPAN LANG="en-US"><FONT SIZE=3>http://en.wikipedia.org/wiki/</FONT></SPAN></A></U></FONT><SPAN LANG="en-US"><FONT SIZE=3><FONT COLOR="#000000">
	or </FONT></FONT></SPAN><FONT COLOR="#0000ff"><U><A HREF="http://ru.wikipedia.org/wiki/"><SPAN LANG="en-US"><FONT SIZE=3>http://ru.wikipedia.org/wiki/</FONT></SPAN></A></U></FONT></P>
	<LI><P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>n_limit_wiki_articles
	&ndash; limit on the number of Wikipedia articles to be indexed</FONT></P>
</OL>
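<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>To
illustrate how the wiki_db value in the list above is composed, here is a
minimal sketch in Python. It is not the code used by edoc. The function
names wiki_db_value and jdbc_url are hypothetical, and the
&quot;jdbc:mysql://&quot; URL form is an assumption based on the option
names and the port 3306 in the examples; the actual program may combine
wiki_host and wiki_db differently.</FONT></P>
<PRE>
```python
from urllib.parse import urlencode

# Option names and values copied from the wiki_db examples in the list above.
INIT_OPTIONS = {
    "useUnicode": "false",
    "characterEncoding": "ISO8859_1",
    "autoReconnect": "true",
    "useUnbufferedInput": "false",
}


def wiki_db_value(language, with_init_string=True):
    """Build the wiki_db parameter: database name plus optional init string."""
    name = language + "wiki"  # "enwiki" for English, "ruwiki" for Russian
    if not with_init_string:
        return name  # in the normal case the name alone is enough
    return name + "?" + urlencode(INIT_OPTIONS)


def jdbc_url(host, db_value):
    """Assumed URL form; the real program may build its URL differently."""
    return "jdbc:mysql://" + host + "/" + db_value


if __name__ == "__main__":
    print(wiki_db_value("en"))
    print(jdbc_url("192.168.0.29", wiki_db_value("ru", with_init_string=False)))
```
</PRE>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><FONT SIZE=3>Keeping
the options in one dictionary makes it easy to drop the initialization
string once the database has been created correctly.</FONT></P>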
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
<P LANG="en-US" CLASS="western" STYLE="margin-bottom: 0in"><BR>
</P>
</BODY>
</HTML>